HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递)
2025.11.24 | Open-Source 7B Model Sets a New Mark in Multimodal Reasoning; GeoVista, a Small Model for Precise Geolocalization

Update: 2025-11-24

Description

The 15 papers in this episode are:

[00:21] 🧠 OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

[01:04] 🌍 GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

[01:41] 🎯 SAM 3: Segment Anything with Concepts

[02:31] 📊 Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story

[03:09] 🧠 O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents

[03:43] 🦜 Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs

[04:26] 🧠 RynnVLA-002: A Unified Vision-Language-Action and World Model

[05:19] 🧠 VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models

[05:51] 🌍 WorldGen: From Text to Traversable and Interactive 3D Worlds

[06:34] 🎨 Loomis Painter: Reconstructing the Painting Process

[07:06] 🔮 Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight

[07:48] 🎨 InstructMix2Mix: Consistent Sparse-View Editing Through Multi-View Model Personalization

[08:21] 🔬 OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists

[09:07] 🧬 MergeDNA: Context-aware Genome Modeling with Dynamic Tokenization through Token Merging

[09:41] 🔍 Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递
